Determining Vulnerable Areas of Malekan Plain Aquifer for Nitrate, Using Random Forest Method

Authors

Hossein Norouzi

M.Sc. in Hydrogeology, Faculty of Natural Sciences, University of Tabriz

Asghar Asghari Moghaddam

Professor of Hydrogeology, Faculty of Natural Sciences, University of Tabriz

Ata Allah Nadiri

Assistant Professor of Hydrogeology, Faculty of Natural Sciences, University of Tabriz

Abstract

Introduction: Management of groundwater is essential, especially in dry regions such as Iran, and the concern grows with the development of agriculture and industry, population growth, and climate change, all of which affect the quality and quantity of groundwater resources. Groundwater contamination can therefore threaten human health. Because groundwater moves slowly through the subsurface, the impact of anthropogenic activities may last for a relatively long time; for that reason, environmental measures should focus mainly on preventing contamination. One way to prevent groundwater contamination is to identify the vulnerable regions of aquifers and manage land use accordingly. Producing groundwater vulnerability maps requires diverse methods and techniques, based on hydrogeological knowledge of the region under study and on predictive models. To decide which areas are vulnerable, a large volume of data can be collected, which cannot be analyzed effectively without an adequate and efficient model. Several methods have been devised for vulnerability mapping that use relatively few data and are based on evidence of contamination. In this study, the random forest (RF) algorithm is proposed to overcome the problems of other methods.

Materials and methods: Malekan plain is located in East Azarbaijan province, southeast of Urmia Lake in northwestern Iran, and covers 450 km². The region is one of the most actively cultivated areas, and its water demand is supplied by groundwater resources. In recent years, the groundwater quality of the area has degraded. The Malekan region contains different geological formations, such as the Lalon, Shemshak, and Lar formations, and a large part of the western area consists of Quaternary alluvial deposits.
The aquifer of this plain is unconfined and is formed mainly by old and recent alluvial terraces, alluvial fans, and fluvial sediments. Based on well drilling logs and geophysical data, the western part of the plain is made of fine-grained material with low permeability. Because of farming, the presence of grape farms, and the intensive use of fertilizers and manure, the nitrate concentration of the aquifer's groundwater is high (Figure 1). To evaluate the quality of groundwater resources, and in particular to assess nitrate anomalies in the groundwater of the Malekan plain, 27 samples were collected from groundwater resources in September 2014, and hydrochemical analyses were carried out in the hydrology laboratory of Tabriz University. In this study, the random forest (RF) algorithm, a learning method based on an ensemble of decision trees, is proposed. The RF technique has advantages over other methods: high prediction accuracy, the ability to learn nonlinear relationships, and the ability to determine which variables are important in the prediction. In this paper, the RF method is used to estimate the vulnerability of the Malekan aquifer with four sets of data: model A with all variables, model B with variables related to the characteristics of the aquifer, model C with driving-force variables, and model D with variables related to the DRASTIC method. The predictions derived from all possible parameter combinations were evaluated using the root mean square error (RMSE) and the mean square error (MSE). The area under the curve (AUC) statistic was used to determine which models and which combinations of datasets performed better. An AUC value of 1 is considered perfect.

Fig. 1. Spatial distribution of nitrate concentration.

Results and discussion: Of the 23 explanatory variables used in the model, five (depth to water table, hydraulic conductivity, distance to grape farms, hydraulic gradient, and transmissivity) describe the behavior of nitrate in the Malekan plain aquifer more accurately, since they yielded a smaller MSE. To obtain continuous and standardized variables for the whole study area, all data were transformed into raster format using three main approaches: 1) geostatistical techniques (e.g. hydraulic conductivity, hydraulic gradient, and soil texture), 2) Euclidean distance raster calculations (potential point sources of contamination), and 3) classification of land cover from remotely sensed data and NDVI. To set the value of k from which the error converges and which also makes the estimation more reliable, models made up of 1000 trees were generated from all explanatory variables. The parameter was optimized by varying the number of split variables between 1 and the maximum number of variables in each subset. The resulting models were evaluated using the out-of-bag (OOB) error estimate, and the model with the lowest OOB error was selected as the most accurate. Moreover, to reduce dimensionality and improve the accuracy and interpretability of the models, a feature selection (FS) strategy was adopted. The most significant predictive features were selected using the importance measures of RF, and the least significant explanatory variables of each subset were removed until the minimum error rate was reached.
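As a rough illustration of the tuning step described above, the sketch below varies the number of split variables and keeps the value with the lowest OOB error. It is not the authors' implementation: to stay dependency-free it uses one-split trees (decision stumps) in place of full decision trees, 100 trees rather than 1000, and synthetic data in which only the first variable carries signal. The names `fit_stump`, `predict_stump`, and `oob_error`, and every number in it, are hypothetical.

```python
import random

def fit_stump(X, y, rows, features):
    """Return the (feature, threshold, polarity) with the fewest errors on the bootstrap rows."""
    best = None
    for f in features:
        for t in sorted({X[i][f] for i in rows}):
            for polarity in (0, 1):
                # polarity 1 predicts class 1 when value >= t; polarity 0 when value < t
                errors = sum(((X[i][f] >= t) == bool(polarity)) != bool(y[i]) for i in rows)
                if best is None or errors < best[0]:
                    best = (errors, f, t, polarity)
    return best[1:]

def predict_stump(stump, x):
    f, t, polarity = stump
    return int((x[f] >= t) == bool(polarity))

def oob_error(X, y, n_trees, mtry, seed=0):
    """Out-of-bag misclassification rate of a bagged ensemble of stumps."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    votes = [[0, 0] for _ in range(n)]  # per-sample OOB votes for class 0 and class 1
    for _ in range(n_trees):
        rows = [rng.randrange(n) for _ in range(n)]  # bootstrap sample with replacement
        features = rng.sample(range(p), mtry)        # random subset of split variables
        stump = fit_stump(X, y, rows, features)
        in_bag = set(rows)
        for i in range(n):
            if i not in in_bag:                      # only trees that never saw sample i vote
                votes[i][predict_stump(stump, X[i])] += 1
    scored = [(v, y[i]) for i, v in enumerate(votes) if sum(v) > 0]
    wrong = sum((v[1] > v[0]) != bool(label) for v, label in scored)
    return wrong / len(scored)

# Synthetic data (assumption): 40 samples, 5 variables, class driven by variable 0 only.
data_rng = random.Random(42)
X = [[data_rng.random() for _ in range(5)] for _ in range(40)]
y = [int(row[0] > 0.5) for row in X]

# Vary the number of split variables from 1 to the number of variables in the subset.
oob_by_mtry = {m: oob_error(X, y, n_trees=100, mtry=m) for m in range(1, 6)}
best_mtry = min(oob_by_mtry, key=oob_by_mtry.get)
print(oob_by_mtry, "-> lowest OOB error at mtry =", best_mtry)
```

Because each stump has a single split, drawing the variable subset once per tree is equivalent to drawing it once per split. The importance-based removal of the least significant variables would wrap this loop, refitting after each removal and stopping when the OOB error no longer decreases.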
Nitrate concentration was rescaled to a new binary response variable for every sample: samples with nitrate concentrations higher than or equal to the threshold value were assigned a value of 1, and samples below the threshold a value of 0. The explanatory variables (predictors) and the response variable were combined into a set of input feature vectors, which formed the input to the RF algorithm. The binary response variable (nitrate pollution) was used as the target value for training the algorithm. Four models were used in this study to predict nitrate contamination of groundwater. As shown in Fig. 2, models A and B, with RMSE values of 0.11157 and 0.12214, respectively, placed approximately 44 and 42 percent of the region in the high-vulnerability class, located in the central and eastern parts of the aquifer. Models C and D, however, with RMSE values of 0.1392 and 0.1597, respectively, placed approximately 15 and 24 percent of the region in the high-vulnerability class and could not be trusted for assessing groundwater vulnerability.

Fig. 2. Vulnerability maps of the four models: a) all variables, b) variables related to the characteristics of the aquifer, c) driving-force variables, and d) variables related to the DRASTIC method.

Keywords: groundwater, Malekan plain, nitrate, vulnerability, random forest
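The rescaling of nitrate concentration to a binary response and the RMSE and AUC evaluation mentioned in the abstract can be sketched as follows. This is an illustration under stated assumptions, not the study's data or code: the concentrations, the predicted probabilities, and the 50 mg/L threshold are made up (the abstract does not state the threshold used), and the helper names are hypothetical.

```python
import math

def binarize(concentrations, threshold):
    """Assign 1 when the concentration is >= threshold, otherwise 0."""
    return [1 if c >= threshold else 0 for c in concentrations]

def rmse(targets, predictions):
    """Root mean square error between binary targets and predicted probabilities."""
    squared = [(t - p) ** 2 for t, p in zip(targets, predictions)]
    return math.sqrt(sum(squared) / len(squared))

def auc(targets, scores):
    """Share of (positive, negative) pairs ranked correctly; ties count as one half."""
    pos = [s for t, s in zip(targets, scores) if t == 1]
    neg = [s for t, s in zip(targets, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Made-up example values (assumptions, not measurements from the study).
nitrate_mg_l = [12.0, 55.0, 48.0, 73.0, 50.0, 30.0]
THRESHOLD_MG_L = 50.0  # assumed threshold for illustration only
predicted_prob = [0.10, 0.80, 0.60, 0.90, 0.40, 0.20]  # hypothetical model output

targets = binarize(nitrate_mg_l, THRESHOLD_MG_L)
model_rmse = rmse(targets, predicted_prob)
model_auc = auc(targets, predicted_prob)
print(targets, round(model_rmse, 4), round(model_auc, 4))
```

An AUC of 1 would mean every polluted sample received a higher predicted probability than every unpolluted one, which matches the abstract's statement that a value of 1 is considered perfect.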

Similar resources

Groundwater Flow and Salinity Intrusion Simulation in Malekan Plain Aquifer

Due to groundwater overextraction in recent years, the Malekan plain aquifer faces a risk of increasing salinization. A groundwater flow model of the plain was designed to investigate the groundwater flow regime and estimate hydraulic parameters precisely, using various methods such as geological surveys, tomography methods, hydrochemical analysis and simulation of aquife...

Simulation of Nitrate Concentration in Aquifer in Qazvin Plain Using Groundwater Modeling System

Background: Nitrate is among the major anions in drinking water. Therefore, it is crucial to investigate its concentration in water resources. Modeling is a management strategy to predict the behavior of nitrate in water resources. Objective: The current study aimed to predict nitrate concentration in the aquifer of Qazvin plain, using Groundwater Modeling System (GMS). Methods: The GMS7.1 software ...

Estimation of Subsidence Potential Index Using the PCSM Method and Fuzzy Model in Ardabil Plain Aquifer

Recently, land subsidence due to natural and human factors has become catastrophically destructive for residential, agricultural and industrial areas. In this study, the areas of Ardabil plain with high subsidence potential were identified to control and manage this phenomenon. Thus, the seven parameters affecting subsidence were rated and weighted and the subsidence potential index (SPI) was...

3D Detection of Power-Transmission Lines in Point Clouds Using Random Forest Method

Inspection of power transmission lines using classic expert-based methods suffers from disadvantages such as high time and money consumption. The advent of UAVs and their application in aerial data gathering helps to decrease the time and cost prominently. The purpose of this research is to present an efficient automated method for inspection of power transmission lines based on point c...

Zoning of groundwater contaminated by Nitrate using geostatistics method (case study: Bahabad plain, Yazd, Iran)

Groundwater quality management is one of the most important issues in many arid and semi-arid regions, including Iran. Nitrate (NO3-) is one of the most common anions contaminating groundwater. This study aimed to range nitrate concentrations in water resources in Bahabad plain in Yazd province. To evaluate the nitrate data in this descriptive study, 260 nitrate samples from 13 wells in Bahabad we...

Identification and Prioritization of Suitable Areas for Artificial Nutrition of Groundwater using GIS and AHP Method (Case Study: Tajan Plain)

In coastal regions, groundwater decline is one of the most important problems. Excessive groundwater exploitation is one of the main causes of saline water intrusion, especially in seasons with low precipitation. Therefore, the current research identifies suitable areas for artificial groundwater recharge in the Tajan plain (with an area of...

Journal title:
Journal of Environmental Studies (Mohit Shenasi)

Volume 41, Issue 4, pages 923-942
